Preprocessing of Web Logs
Author
Abstract
Today’s real-world databases are highly susceptible to noisy, missing, and inconsistent data because of their typically huge size and their origin in multiple, heterogeneous sources. Hence, preprocessing is necessary to improve the quality of the data and, consequently, of the mining results. There are a number of data preprocessing techniques. In this paper, we discuss two approaches to data preprocessing, one based on XML and the other based on text files. The basic algorithm and the steps involved in preprocessing are the same for both approaches.
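The abstract does not list the preprocessing steps themselves, so the following is only a minimal sketch of one common step, data cleaning, applied to a text-file log. It assumes entries in the Common Log Format; the names `parse_line`, `is_relevant`, `clean_log`, and the list of ignored extensions are hypothetical and are not taken from the paper.

```python
import re

# Assumption: entries follow the Common Log Format:
# host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

# Assumption: requests for these embedded resources are not user page views.
IGNORED_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it is malformed."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None


def is_relevant(entry):
    """Keep successful (2xx) requests that do not target an ignored resource."""
    path = entry["path"].split("?", 1)[0].lower()
    return entry["status"].startswith("2") and not path.endswith(IGNORED_EXTENSIONS)


def clean_log(lines):
    """Parse every line, dropping malformed and irrelevant entries."""
    entries = (parse_line(line) for line in lines)
    return [e for e in entries if e is not None and is_relevant(e)]


sample = [
    '192.0.2.1 - - [10/Oct/2010:13:55:36 +0000] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/2010:13:55:37 +0000] "GET /logo.gif HTTP/1.0" 200 512',
    '192.0.2.7 - - [10/Oct/2010:13:56:01 +0000] "GET /missing.html HTTP/1.0" 404 209',
    'not a log line',
]
cleaned = clean_log(sample)
```

In this sample only the request for `/index.html` survives: the image request, the failed request, and the malformed line are discarded. An XML-based variant could apply the same filter to parsed elements instead of regular-expression matches.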
Similar resources
An Efficient Algorithm for Data Cleaning of Web Logs with Spider Navigation Removal
The World Wide Web is growing massively with the exponential growth of websites that provide users with heaps of information. Text files called web logs store a user's clicks whenever the user visits a website. Web usage mining is a branch of web mining that applies mining techniques to the server logs containing the user clickstreams....
User Interest Level Based Preprocessing Algorithms Using Web Usage Mining
Web logs play an important role in understanding user behavior. Several pattern mining techniques have been developed to understand user behavior. A specific kind of preprocessing technique improves the quality and accuracy of the pattern mining algorithms. The existing algorithms perform preprocessing to reduce the size of the log file and to identify the number of unique users...
A Comparison of Iranian Libraries' and Librarians' Weblogs with Top Librarianship Weblogs; 1385
Introduction: Web logs are evident tools for librarians. There are three main ways of applying web logs in librarianship: personal use by librarians to upgrade their personal information, as a source of information for libraries, and for library services. The aim of this research is to compare Iranian libraries and librarians, and superior librarianshi...
Web Usage Mining: users' navigational patterns extraction from web logs using ant-based clustering method
Web Usage Mining is the process of applying data mining techniques to discover usage patterns in data extracted from Web log files. It mines the secondary data (web logs) derived from users' interactions with web pages during a certain period of Web sessions. Web usage mining consists of three phases, namely preprocessing, pattern discovery, and pattern analysis. In this paper, w...
An Efficient Algorithm for Data Cleaning of Log File using File Extensions
The World Wide Web is a monolithic repository of web pages that provides Internet users with heaps of information. With the growth in the number and complexity of websites, the web has become massively large. Web Usage Mining is a division of web mining that applies mining techniques to web server logs in order to extract the behavior of users. A Web Usage Mining process com...
Web Anomaly Detection by Creating Access Usage Profiles
Due to the increase in cyber-attacks, the need for techniques that detect attacks on web servers has drawn attention. Unfortunately, many available security solutions are inefficient at identifying web-based attacks. The main aim of this study is to detect abnormal web navigation based on web usage profiles. In this paper, comparing scrolling behavior of a normal user with an attacker, and simu...
Publication date: 2010